feat(unixfs): configurable CID Profiles from IPIP-499#1088

Merged
lidel merged 31 commits into main from feat/ipip-499-unixfs-2025 on Feb 4, 2026

lidel (Member) commented Jan 17, 2026

Adds building blocks for reproducible CID generation across IPFS implementations, based on IPIP-499.

Users of boxo can now:

  • Choose how directory size is estimated for HAMT sharding: legacy link-based (SizeEstimationLinks), accurate block-based (SizeEstimationBlock), or size-based thresholds disabled entirely (SizeEstimationDisabled)
  • Apply predefined import settings with the UnixFS_v0_2015 or UnixFS_v1_2025 profiles from IPIP-499 via ApplyGlobals() and CidBuilder()
  • Resolve all symlinks (not only root-level ones) to their target content during file traversal with SerialFileOptions.DereferenceSymlinks
  • Get HAMT behavior consistent with the JS implementation (threshold comparison changed from >= to >) – 6707376

Related: IPIP-499, used by kubo#11148

lidel added 2 commits January 16, 2026 23:42
add configurable size estimation modes for determining when to switch
between BasicDirectory and HAMTDirectory:

- SizeEstimationLinks: legacy mode using len(name) + len(CID), default
- SizeEstimationBlock: full serialized dag-pb block size (accurate)
- SizeEstimationDisabled: link-count only via MaxLinks, ignores size

includes:
- HAMTSizeEstimation global for default mode
- WithSizeEstimationMode option for per-directory override
- helper functions for accurate protobuf size calculation

part of IPIP-499 UnixFS CID Profiles implementation.
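
The three modes listed in this commit message can be illustrated with a small, self-contained sketch. The names `estimationMode`, `entry`, `linksEstimate` and `shouldShard` are hypothetical stand-ins and not boxo's API. The serialized block size is passed in by the caller, and treating `maxLinks == 0` as "no limit" is an assumption of this sketch.

```go
package main

import "fmt"

// estimationMode mirrors the three modes named in the commit message.
// Hypothetical type: boxo's real identifiers and values may differ.
type estimationMode int

const (
	modeLinks    estimationMode = iota // sum of len(name) + len(CID) per entry
	modeBlock                          // full serialized dag-pb block size
	modeDisabled                       // ignore size, rely on link count only
)

// entry is a simplified directory link: a name and the CID length in bytes.
type entry struct {
	name   string
	cidLen int
}

// linksEstimate is the legacy estimate: len(name) + len(CID) for every entry.
func linksEstimate(entries []entry) int {
	total := 0
	for _, e := range entries {
		total += len(e.name) + e.cidLen
	}
	return total
}

// shouldShard decides whether a directory should become a HAMT.
// blockSize is the serialized block size, supplied by the caller here.
// maxLinks == 0 means "no link-count limit" (an assumption of this sketch).
func shouldShard(mode estimationMode, entries []entry, blockSize, threshold, maxLinks int) bool {
	if maxLinks > 0 && len(entries) > maxLinks {
		return true
	}
	switch mode {
	case modeLinks:
		return linksEstimate(entries) > threshold
	case modeBlock:
		return blockSize > threshold
	default: // modeDisabled: size never triggers sharding
		return false
	}
}

func main() {
	entries := []entry{{"a.txt", 36}, {"b.txt", 36}}
	fmt.Println(linksEstimate(entries))
	fmt.Println(shouldShard(modeLinks, entries, 0, 80, 0))
	fmt.Println(shouldShard(modeDisabled, entries, 0, 80, 0))
	fmt.Println(shouldShard(modeDisabled, entries, 0, 80, 1))
}
```

With two entries of 5-byte names and 36-byte CIDs the link-based estimate is 82 bytes, so an (artificially small) 80-byte threshold triggers sharding in links mode but not in disabled mode, where only the link count matters.
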

introduces UnixFSProfile struct with predefined profiles:
- UnixFS_v0_2015: legacy CIDv0 settings (256 KiB chunks, 174 links/node)
- UnixFS_v1_2025: modern CIDv1 settings (1 MiB chunks, 1024 links/node)

profiles control file chunking, DAG width, and HAMT sharding parameters.
ApplyGlobals() sets all relevant global variables at once.

part of IPIP-499 implementation.
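
A rough sketch of the "profile applied to globals" pattern follows. The `profile` struct, its two fields, and the package-level variables are simplified stand-ins chosen for illustration; only the chunk sizes and links-per-node values are taken from the commit message above. The real `UnixFSProfile` covers more settings.

```go
package main

import "fmt"

// Stand-ins for package-level defaults. The names echo globals mentioned in
// this PR, but these are local variables of the sketch, not boxo's.
var (
	defaultBlockSize     int64 = 256 << 10
	defaultLinksPerBlock       = 174
)

// profile is a hypothetical, reduced version of a UnixFS profile.
type profile struct {
	ChunkSize     int64 // bytes per file chunk
	LinksPerBlock int   // maximum links per file DAG node
}

// Values taken from the commit message: 256 KiB / 174 and 1 MiB / 1024.
var (
	v0_2015 = profile{ChunkSize: 256 << 10, LinksPerBlock: 174}
	v1_2025 = profile{ChunkSize: 1 << 20, LinksPerBlock: 1024}
)

// applyGlobals overwrites the package-level defaults in one call.
// Like any write to globals, it is not safe for concurrent use and is
// meant to run once during program initialization.
func (p profile) applyGlobals() {
	defaultBlockSize = p.ChunkSize
	defaultLinksPerBlock = p.LinksPerBlock
}

func main() {
	v1_2025.applyGlobals()
	fmt.Println(defaultBlockSize, defaultLinksPerBlock)
}
```
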
lidel changed the title from "feat(unixfS): configurable CID Profiles from IPIP-499" to "feat(unixfs): configurable CID Profiles from IPIP-499" on Jan 17, 2026
codecov bot commented Jan 17, 2026

Codecov Report

❌ Patch coverage is 91.82561% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.67%. Comparing base (2688767) to head (f25727f).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ipld/unixfs/mod/dagmodifier.go | 72.91% | 8 Missing and 5 partials ⚠️ |
| ipld/unixfs/io/directory.go | 94.24% | 6 Missing and 5 partials ⚠️ |
| ipld/unixfs/unixfs.go | 66.66% | 2 Missing and 1 partial ⚠️ |
| mfs/root.go | 95.00% | 1 Missing and 1 partial ⚠️ |
| mfs/dir.go | 95.45% | 1 Missing ⚠️ |

@@            Coverage Diff             @@
##             main    #1088      +/-   ##
==========================================
+ Coverage   61.16%   61.67%   +0.51%     
==========================================
  Files         264      265       +1     
  Lines       26237    26521     +284     
==========================================
+ Hits        16048    16357     +309     
+ Misses       8516     8478      -38     
- Partials     1673     1686      +13     

| Files with missing lines | Coverage | Patch | Δ |
|---|---|---|---|
| chunker/parse.go | 53.96% | ø | ø |
| files/serialfile.go | 77.55% | 100.00% | +12.11% ⬆️ |
| ipld/unixfs/hamt/hamt.go | 80.25% | 100.00% | +0.85% ⬆️ |
| ipld/unixfs/io/profile.go | 100.00% | 100.00% | ø |
| mfs/file.go | 66.66% | 100.00% | +4.97% ⬆️ |
| mfs/ops.go | 52.08% | 100.00% | +6.15% ⬆️ |
| mfs/dir.go | 59.17% | 95.45% | +1.85% ⬆️ |
| mfs/root.go | 61.68% | 95.00% | +28.82% ⬆️ |
| ipld/unixfs/unixfs.go | 82.52% | 66.66% | +2.52% ⬆️ |
| ipld/unixfs/io/directory.go | 82.04% | 94.24% | +10.83% ⬆️ |

... and 1 more

... and 9 files with indirect coverage changes

lidel added 2 commits January 17, 2026 01:32
add SerialFileOptions and NewSerialFileWithOptions to control whether
symlinks are preserved as UnixFS symlink nodes (Data.Type=4) or
dereferenced and replaced with their target content during file
traversal.
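
The difference between preserving and dereferencing a symlink can be sketched with the standard library alone. `classify` and `demo` are hypothetical helpers for illustration; they do not use boxo's `SerialFileOptions`, and only show the `os.Lstat` versus `os.Stat` distinction such an option would rest on.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// classify reports how a path would be treated during traversal.
func classify(path string, dereference bool) (string, error) {
	info, err := os.Lstat(path) // Lstat describes the link itself
	if err != nil {
		return "", err
	}
	if info.Mode()&os.ModeSymlink != 0 {
		if !dereference {
			return "symlink", nil
		}
		// Stat follows the link; it fails for dangling or circular links.
		if info, err = os.Stat(path); err != nil {
			return "", err
		}
	}
	if info.IsDir() {
		return "directory", nil
	}
	return "file", nil
}

// demo builds a temporary directory with one file and one symlink to it,
// then classifies the symlink in both modes.
func demo() (preserved, dereferenced string, err error) {
	dir, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "target.txt")
	if err = os.WriteFile(target, []byte("hello"), 0o644); err != nil {
		return "", "", err
	}
	link := filepath.Join(dir, "link")
	if err = os.Symlink(target, link); err != nil {
		return "", "", err
	}
	if preserved, err = classify(link, false); err != nil {
		return "", "", err
	}
	dereferenced, err = classify(link, true)
	return preserved, dereferenced, err
}

func main() {
	fmt.Println(demo())
}
```
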

Link.Size is already uint64, so the explicit conversions are redundant
and flagged by golangci-lint unconvert check.

HAMT sharding threshold comparison was historically implemented as
`>` in JS and `>=` in Go:

- JS: https://github.com/ipfs/helia/blob/005c2a7/packages/unixfs/src/commands/utils/is-over-shard-threshold.ts#L31
- Go: https://github.com/ipfs/boxo/blob/319662c/ipld/unixfs/io/directory.go#L438

This inconsistency meant a directory exactly at the 256 KiB threshold
would stay basic in JS but convert to HAMT in Go, producing different
CIDs for the same input.

This commit changes Go to use `>` (matching JS), so a directory exactly
at the threshold now stays as a basic (flat) directory. This aligns
cross-implementation behavior for CID determinism per IPIP-499.

Also adds SizeEstimationMode to MkdirOpts so MFS directories respect
the configured estimation mode instead of always using the global default.
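
The boundary case described above is easy to see in isolation. In this minimal sketch, `shardsOld` and `shardsNew` are hypothetical names for the previous (`>=`) and new (`>`) comparisons against the 256 KiB threshold.

```go
package main

import "fmt"

const threshold = 256 << 10 // 262144 bytes

// shardsOld models the previous Go comparison (>=).
func shardsOld(size int) bool { return size >= threshold }

// shardsNew models the comparison after this change (>), matching JS.
func shardsNew(size int) bool { return size > threshold }

func main() {
	for _, size := range []int{threshold - 1, threshold, threshold + 1} {
		fmt.Println(size, shardsOld(size), shardsNew(size))
	}
}
```

Only the middle value differs: at exactly 262144 bytes the old comparison converts to HAMT while the new one keeps a basic directory.
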
lidel force-pushed the feat/ipip-499-unixfs-2025 branch from b844fc9 to 6707376 on January 19, 2026 04:37
lidel added 2 commits January 20, 2026 00:03
- fix trailing newline in directory_test.go
- add #1088 PR references to changelog entries
- files: fix nil filter check in serialFile.Size()
- unixfs/io: document thread-safety for global vars and ApplyGlobals
- changelog: move DefaultBlockSize to Changed section with breaking marker

// Thread safety: this function modifies global variables and is not safe
// for concurrent use. Call it once during program initialization, before
// starting any imports. Do not call from multiple goroutines.
func (p UnixFSProfile) ApplyGlobals() {
lidel (Member Author) commented Jan 20, 2026

ℹ️ boxo already had globals (DefaultBlockSize, DefaultLinksPerBlock, etc.)

this is a very surgical way of having predefined UnixFS profiles in boxo itself, which users can apply programmatically at startup of their app.

i don't like it tbh, but other Go code we already use (like certmagic, or even net.DefaultResolver) manages global defaults in a similar way, so maybe it's just me not being a fan of globals.

this is a compromise: it delivers the ability to set-and-forget a profile on startup, implemented in the smallest amount of code that avoids breaking every user of existing APIs

@lidel lidel marked this pull request as ready for review January 20, 2026 02:14
@lidel lidel requested a review from a team as a code owner January 20, 2026 02:14
gammazero (Contributor) left a comment

I added a circular symlink test showing that import works when symlinks are not dereferenced and fails when they are.

lidel (Member Author) commented Jan 27, 2026

Triage:

  • now that we agreed on SizeEstimationBlock we can optimize to avoid unnecessary calculations and also add tests for edge cases like mode/mtime changes

@lidel lidel marked this pull request as draft January 27, 2026 16:12
lidel added 3 commits January 27, 2026 18:24
IPIP-499 block-bytes estimation improvements:

- add fast path optimization in needsToSwitchByBlockSize to skip
  expensive exact calculation when clearly above threshold (+256 margin)
- clarify documentation for linkSerializedSize, calculateBlockSize,
  cachedBlockSize, and SetStat methods with IPIP-499 context
- extract saveAndRestoreGlobals as package-level test helper

tests for mode/mtime block size overhead:
- verify exact protobuf overhead: mode (3 bytes), mtime seconds (8 bytes),
  nanoseconds (5 bytes), combined (16 bytes)
- verify cachedBlockSize accuracy after add/remove/replace operations
- verify linkSerializedSize matches actual link contribution
- verify HAMT threshold accounts for metadata overhead
- test fast path and near-boundary exact calculation behavior
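
The byte counts quoted in the test list can be reproduced with plain varint arithmetic. The sketch below assumes the UnixFS `Data` message layout as I understand it from the UnixFS spec (`mode` is varint field 7; `mtime` is field 8, a nested message with `Seconds` as varint field 1 and `FractionalNanoseconds` as fixed32 field 2). The sample values (mode 0644, seconds 1700000000) are my own choices that happen to yield those counts; other values can differ.

```go
package main

import "fmt"

// varintLen returns how many bytes protobuf needs to varint-encode v.
func varintLen(v uint64) int {
	n := 1
	for v >= 0x80 {
		v >>= 7
		n++
	}
	return n
}

// tagLen returns the encoded size of a field tag (field number << 3 | wire type).
func tagLen(field int) int { return varintLen(uint64(field) << 3) }

// modeOverhead is the size of the optional mode field (assumed field 7, varint).
func modeOverhead(mode uint32) int {
	return tagLen(7) + varintLen(uint64(mode))
}

// mtimeOverhead is the size of the optional mtime field (assumed field 8),
// a nested message holding seconds and, optionally, fixed32 nanoseconds.
func mtimeOverhead(seconds int64, withNanos bool) int {
	inner := tagLen(1) + varintLen(uint64(seconds)) // negative values take 10 bytes
	if withNanos {
		inner += tagLen(2) + 4 // fixed32 is always 4 bytes
	}
	return tagLen(8) + varintLen(uint64(inner)) + inner
}

func main() {
	mode := modeOverhead(0o644)
	secs := mtimeOverhead(1700000000, false)
	full := mtimeOverhead(1700000000, true)
	fmt.Println(mode, secs, full-secs, mode+full)
}
```

These are sizes inside the UnixFS `Data` bytes; the length prefix of the enclosing dag-pb field can grow by a byte when the payload crosses a varint boundary, which this sketch does not model.
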

…utable

consolidate fragmented size tracking into a single method and field:
- merge `cachedBlockSize` into `estimatedSize` (single field for all modes)
- replace `addToEstimatedSize`, `removeFromEstimatedSize`, and
  `updateCachedBlockSize` with unified `updateEstimatedSize(name, oldLink, newLink)`
- remove `SetSizeEstimationMode` from Directory interface; mode is now
  set only at creation time via `WithSizeEstimationMode` option

this prevents mode changes after directory creation which could cause
size tracking inconsistencies, and simplifies the calling code from
two method calls per operation to one.
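
One way to picture the unified update is a single method where a nil old link means "add", a nil new link means "remove", and both present means "replace". This is a simplified sketch with hypothetical types, using the link-based measure only; the real method also has to handle block-size mode.

```go
package main

import "fmt"

// link is a simplified directory link; only the CID length matters here.
type link struct {
	cidLen int
}

// dir tracks a running size estimate in a single field.
type dir struct {
	estimatedSize int
}

// linkSize is the contribution of one entry; a nil link contributes nothing.
func linkSize(name string, l *link) int {
	if l == nil {
		return 0
	}
	return len(name) + l.cidLen
}

// updateEstimatedSize handles add (oldLink == nil), remove (newLink == nil)
// and replace (both set) with one subtraction and one addition.
func (d *dir) updateEstimatedSize(name string, oldLink, newLink *link) {
	d.estimatedSize += linkSize(name, newLink) - linkSize(name, oldLink)
}

func main() {
	d := &dir{}
	d.updateEstimatedSize("a", nil, &link{cidLen: 36}) // add
	fmt.Println(d.estimatedSize)
	d.updateEstimatedSize("a", &link{cidLen: 36}, &link{cidLen: 34}) // replace
	fmt.Println(d.estimatedSize)
	d.updateEstimatedSize("a", &link{cidLen: 34}, nil) // remove
	fmt.Println(d.estimatedSize)
}
```
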

test coverage:
- TestHAMTToBasicDowngrade: new test for HAMT->Basic threshold boundaries
  covering both SizeEstimationLinks and SizeEstimationBlock modes
- TestEstimatedSizeAccuracy: verifies size tracking after add/remove/replace
- TestProfileHAMTThresholdBehavior: upgrade threshold boundaries
- TestDynamicDirectorySwitch: Basic<->HAMT conversions

removes the need for protobuf serialization when checking HAMT threshold.
the block size is now computed arithmetically from protobuf field definitions:
- dataFieldSerializedSize(): UnixFS Data field (Type + optional mode/mtime)
- linkSerializedSize(): PBLink fields (Hash, Name, Tsize) + wrapper

this replaces the previous approach that serialized a temporary node copy
when near the threshold boundary. the arithmetic calculation is exact and
verified against actual serialization in TestDataFieldSerializedSizeMatchesActual.

calculateBlockSize() moved to test-only code in profile_test.go.
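
As an illustration of computing a link's serialized size arithmetically, the sketch below follows the dag-pb layout (PBLink: `Hash` = field 1, `Name` = field 2, `Tsize` = field 3; links are embedded in PBNode as field 2). It assumes all three link fields are present and is not the boxo implementation; the function here only shares its name with the one mentioned in the commit message.

```go
package main

import "fmt"

// varintLen returns how many bytes protobuf needs to varint-encode v.
func varintLen(v uint64) int {
	n := 1
	for v >= 0x80 {
		v >>= 7
		n++
	}
	return n
}

// lenDelimited is the size of a length-delimited field whose tag fits in one
// byte (true for all field numbers below 16): tag + length varint + payload.
func lenDelimited(payload int) int {
	return 1 + varintLen(uint64(payload)) + payload
}

// linkSerializedSize computes the bytes one link adds to a dag-pb block.
func linkSerializedSize(nameLen, cidLen int, tsize uint64) int {
	inner := lenDelimited(cidLen) + // Hash (field 1, bytes)
		lenDelimited(nameLen) + // Name (field 2, string)
		1 + varintLen(tsize) // Tsize (field 3, varint)
	return lenDelimited(inner) // wrapper: PBNode.Links (field 2)
}

func main() {
	// 8-byte name, 36-byte CID, Tsize of 1024.
	fmt.Println(linkSerializedSize(8, 36, 1024), 8+36)
}
```

For this entry the serialized contribution is 53 bytes, compared with 44 bytes under the legacy `len(name) + len(CID)` estimate, which is why the two modes reach the threshold at different directory sizes.
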
lidel force-pushed the feat/ipip-499-unixfs-2025 branch from 19fb7a1 to 6141039 on January 27, 2026 20:29
@lidel lidel marked this pull request as ready for review January 28, 2026 00:18
lidel added 2 commits February 2, 2026 03:42
add per-directory HAMT sharding size threshold support:
- add hamtShardingSize field to BasicDirectory and HAMTDirectory
- add Get/SetHAMTShardingSize() methods to Directory interface
- add getEffectiveShardingSize() helper for per-directory or global fallback
- propagate HAMTShardingSize to child directories in cacheNode/setNodeData
- add HAMTShardingSize to MkdirOpts with parent inheritance

add RootOptions:
- WithMaxHAMTFanout(n) sets HAMT bucket width
- WithHAMTShardingSize(size) sets per-directory size threshold

add tests:
- TestRootOptionMaxHAMTFanout
- TestRootOptionHAMTShardingSize
- TestHAMTShardingSizeInheritance
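
The "per-directory value with global fallback, inherited by children" behaviour can be sketched as follows. All names are hypothetical, and treating `0` as "unset" is an assumption of this sketch rather than something stated in the commit message.

```go
package main

import "fmt"

// globalShardingSize stands in for the package-level default threshold.
var globalShardingSize = 256 << 10

// dir carries an optional per-directory threshold; 0 means "unset".
type dir struct {
	hamtShardingSize int
}

// effectiveShardingSize prefers the per-directory value and falls back
// to the global default when none is set.
func (d *dir) effectiveShardingSize() int {
	if d.hamtShardingSize > 0 {
		return d.hamtShardingSize
	}
	return globalShardingSize
}

// mkdir creates a child that inherits the parent's threshold unless an
// explicit override is given.
func mkdir(parent *dir, override int) *dir {
	child := &dir{hamtShardingSize: parent.hamtShardingSize}
	if override > 0 {
		child.hamtShardingSize = override
	}
	return child
}

func main() {
	root := &dir{hamtShardingSize: 1 << 20}
	fmt.Println((&dir{}).effectiveShardingSize())        // global fallback
	fmt.Println(mkdir(root, 0).effectiveShardingSize())  // inherited
	fmt.Println(mkdir(root, 4096).effectiveShardingSize()) // overridden
}
```
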

expandSparse creates zero-padding when writing past end of file,
but wasn't updating dm.curNode to point to the new node.
subsequent writes would use the old unexpanded node, losing data.
lidel force-pushed the feat/ipip-499-unixfs-2025 branch from 3ed97b0 to 7884ae2 on February 3, 2026 00:32
- use Go 1.22+ range-over-int syntax in IPIP-499 tests
- expand ipld/unixfs/io package documentation with overview of
  directory types, HAMT sharding config, and IPIP-499 profiles
@lidel lidel marked this pull request as ready for review February 3, 2026 15:32
gammazero (Contributor) left a comment

Only one question about checking Mode() when converting to leaf node. Otherwise, looks good.

lidel added 4 commits February 3, 2026 18:00
directories with mode/mtime metadata (set via WithStat) would lose this
optional metadata when:
- converting from BasicDirectory to HAMTDirectory (during sharding)
- converting from HAMTDirectory to BasicDirectory (when shrinking)
- reloading a HAMT directory from disk

root cause: HAMT shards did not support mode/mtime in their UnixFS data,
and the conversion functions did not propagate these fields.

changes:
- add HAMTShardDataWithStat() to include mode/mtime in HAMT shard nodes
- add SetStat() to hamt.Shard to store metadata for serialization
- propagate mode/mtime and SizeEstimationMode during Basic<->HAMT
  conversions in DynamicDirectory
- extract mode/mtime from fsNode when loading HAMT via NewDirectoryFromNode

also adds tests for: negative mtime encoding, SizeEstimationDisabled with
maxLinks=0, unicode filenames in size estimation, concurrent HAMT
conversion, mode/mtime preservation after reload, and exact HAMT threshold
boundary behavior.

when calling Mkdir with Mkparents=true and a custom Chunker, intermediate
directories would inherit the chunker from root instead of using the one
specified in MkdirOpts. now parentsOpts includes opts.Chunker so all
directories created in the path use the same chunker.

- fix gofumpt formatting: comment alignment with unicode chars, octal literal
- rename TestConcurrentHAMTConversion to TestSequentialHAMTConversion and
  serialize operations to avoid race condition (Directory is not thread-safe
  for concurrent reads and writes)

move the maxLinks check into the error handling block to eliminate
the intermediate `existed` boolean variable. the logic is equivalent
but more idiomatic: check maxLinks only when RemoveChild returns
ErrNotExist (new entry), skip it when removal succeeds (replacement).

adds test for replacement behavior at maxLinks capacity.

suggested by @gammazero in #1088 review:
#1088 (comment)
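
The control flow described in this commit message might look roughly like the sketch below: the capacity check runs only when removal reports that the entry did not exist, so replacing an entry in a full directory still succeeds. The map-backed `dir` type and both error values are hypothetical; returning an error at capacity is a simplification for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNotExist     = errors.New("entry does not exist")
	errTooManyLinks = errors.New("directory is at maxLinks capacity")
)

// dir is a toy directory: name -> CID string, with an optional link limit.
type dir struct {
	links    map[string]string
	maxLinks int // 0 means unlimited in this sketch
}

// removeChild deletes an entry or reports that it was not there.
func (d *dir) removeChild(name string) error {
	if _, ok := d.links[name]; !ok {
		return errNotExist
	}
	delete(d.links, name)
	return nil
}

// addChild replaces an existing entry or adds a new one.
func (d *dir) addChild(name, cid string) error {
	if err := d.removeChild(name); err != nil {
		if !errors.Is(err, errNotExist) {
			return err
		}
		// New entry: this is the only path that can grow the directory,
		// so the capacity check lives here and no "existed" flag is needed.
		if d.maxLinks > 0 && len(d.links)+1 > d.maxLinks {
			return errTooManyLinks
		}
	}
	d.links[name] = cid
	return nil
}

func main() {
	d := &dir{links: map[string]string{}, maxLinks: 2}
	fmt.Println(d.addChild("a", "cid1"))
	fmt.Println(d.addChild("b", "cid2"))
	fmt.Println(d.addChild("c", "cid3")) // rejected: would exceed maxLinks
	fmt.Println(d.addChild("a", "cid9")) // replacement at capacity is allowed
}
```
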
lidel added 2 commits February 3, 2026 23:32
previously maybeCollapseToRawLeaf only checked ModTime when deciding
whether to keep a ProtoNode wrapper. files with Mode metadata (unix
permissions) but no ModTime would incorrectly collapse to RawNode,
losing the permission information.

now checks both ModTime and Mode before collapsing:
  if !fsn.ModTime().IsZero() || fsn.Mode() != 0 {

also refactored metadata preservation tests into a table-driven test
covering ModTime-only, Mode-only, and both metadata fields.

suggested-by: @gammazero
ref: #1088 (comment)
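
A table-driven sketch of the collapse decision is shown below. `fileMeta` and `canCollapseToRawLeaf` are hypothetical stand-ins for the UnixFS node and the real `maybeCollapseToRawLeaf`; only the condition itself mirrors the line quoted in the commit message.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// fileMeta holds the two optional metadata fields a raw block cannot store.
type fileMeta struct {
	modTime time.Time
	mode    os.FileMode
}

// canCollapseToRawLeaf reports whether a single-block dag-pb wrapper may be
// dropped. Any metadata (ModTime or Mode) means the wrapper must stay.
func canCollapseToRawLeaf(m fileMeta) bool {
	if !m.modTime.IsZero() || m.mode != 0 {
		return false
	}
	return true
}

// examples covers no metadata, ModTime-only, Mode-only, and both.
var examples = []struct {
	name string
	meta fileMeta
	want bool
}{
	{"no metadata", fileMeta{}, true},
	{"ModTime only", fileMeta{modTime: time.Unix(1700000000, 0)}, false},
	{"Mode only", fileMeta{mode: 0o644}, false},
	{"both", fileMeta{modTime: time.Unix(1700000000, 0), mode: 0o644}, false},
}

func main() {
	for _, ex := range examples {
		fmt.Printf("%-12s collapse=%v\n", ex.name, canCollapseToRawLeaf(ex.meta))
	}
}
```

The "Mode only" row is the case that previously collapsed by mistake and lost the permission bits.
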

moved package docs from dagmodifier.go to dedicated doc.go file.
expanded documentation to cover:

- MFS semantics for metadata handling
- clarification that Mode and ModTime are optional (most use cases
  do not set them)
- mtime update behavior on content modification (matches Unix fs)
- identity CID handling
- RawNode growth conversion
- raw leaf collapsing behavior

also added inline comments at mtime update sites explaining the
behavior and how to preserve specific mtime values if needed.
lidel (Member Author) commented Feb 3, 2026

Addressed review feedback:

  • added fsn.Mode() check in maybeCollapseToRawLeaf per @gammazero's suggestion
  • added doc.go to ipld/unixfs/mod documenting MFS semantics and metadata handling (just to make it clear it's intended, and not accidental, design)
  • added table-driven tests for Mode/ModTime metadata preservation just to be extra safe
  • clarified that Mode/ModTime are optional (most use cases don't set them)

All CI checks are passing. If there are no concerns, I will merge tomorrow to unblock Kubo 0.40 RC1 (ipfs/go-ipfs-cmds#315, ipfs/kubo#11148).

gammazero (Contributor) left a comment

🚀

lidel added 2 commits February 4, 2026 20:42
AddChild HAMT->Basic was missing WithSizeEstimationMode, and
RemoveChild HAMT->Basic was missing WithMaxHAMTFanout. This caused
settings to be lost when directories converted between types.

adds test verifying all settings (MaxLinks, MaxHAMTFanout,
SizeEstimationMode, HAMTShardingSize, CidBuilder) are preserved
in both conversion directions.

…alues

add upfront validation for WithMaxHAMTFanout option instead of silently
falling back to default. NewBasicDirectory and NewHAMTDirectory now
return ErrInvalidHAMTFanout when an invalid value is provided.

valid values must be a positive power of 2 AND multiple of 8
(e.g., 8, 16, 32, 64, 128, 256). use 0 to explicitly request default.

this is a cosmetic improvement: previously invalid values like 2, 4, or 7
would silently fall back to DefaultShardWidth with a warning log. now
the error is returned explicitly, making misconfiguration easier to detect.
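
The validation rule stated above (positive power of 2 and multiple of 8, with 0 meaning "use the default") can be sketched as below. `resolveFanout`, `errInvalidFanout` and `defaultShardWidth` are names local to this sketch, and the default of 256 is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidFanout is a stand-in for the error named in the commit message.
var errInvalidFanout = errors.New("invalid HAMT fanout")

// defaultShardWidth is an assumed default used when 0 is passed.
const defaultShardWidth = 256

// resolveFanout validates n up front instead of silently falling back.
func resolveFanout(n int) (int, error) {
	if n == 0 {
		return defaultShardWidth, nil // 0 explicitly requests the default
	}
	// n&(n-1) == 0 holds only for powers of two (for n > 0).
	if n < 0 || n&(n-1) != 0 || n%8 != 0 {
		return 0, errInvalidFanout
	}
	return n, nil
}

func main() {
	for _, n := range []int{0, 2, 4, 7, 8, 24, 64, 256} {
		width, err := resolveFanout(n)
		fmt.Println(n, width, err)
	}
}
```

Since every power of 2 from 8 upward is already a multiple of 8, the two conditions together amount to "a power of 2 that is at least 8"; the sketch keeps both checks to mirror the wording of the rule.
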
lidel (Member Author) commented Feb 4, 2026

Thanks for reviews!

This is probably as good as it can get; if we missed anything, we should catch it during RC1.

I'm merging and will continue in ipfs/go-ipfs-cmds#315 and then ipfs/kubo#11148.

@lidel lidel merged commit f188f79 into main Feb 4, 2026
16 checks passed
@lidel lidel deleted the feat/ipip-499-unixfs-2025 branch February 4, 2026 20:32
lidel added a commit to ipfs/go-ipfs-cmds that referenced this pull request Feb 4, 2026
switches to boxo@main after merging ipfs/boxo#1088
lidel added a commit to ipfs/go-ipfs-cmds that referenced this pull request Feb 4, 2026
#315)

* feat: add --dereference-symlinks flag for recursive symlink resolution

add new --dereference-symlinks boolean flag that recursively resolves
all symlinks to their target content during file collection. this works
on symlinks inside directories, not just CLI arguments.

the flag is wired through cli/parse.go to boxo's SerialFileOptions.DereferenceSymlinks.

deprecate --dereference-args which only worked on symlinks passed directly
as CLI arguments. the help text now indicates it is deprecated and directs
users to use --dereference-symlinks instead.

ref: ipfs/specs#499

* fix: make --dereference-symlinks resolve CLI arg symlinks too

--dereference-symlinks is now a superset of --dereference-args:
- resolves symlinks passed as CLI arguments (like --dereference-args)
- ALSO resolves symlinks found during directory traversal (new behavior)

this allows users to use just --dereference-symlinks instead of needing
to pass both flags for full symlink resolution.

* chore: update to rebased boxo PR

updates github.com/ipfs/boxo to 56cf0aecdc1a (feat/ipip-499-unixfs-2025 rebased on main)

* fix: reuse derefSymlinks variable, fix typo in deprecation notice

* chore: update boxo to f188f79fd412

switches to boxo@main after merging ipfs/boxo#1088
lidel added a commit to ipfs/kubo that referenced this pull request Feb 4, 2026
switches to boxo@main after merging ipfs/boxo#1088
lidel added a commit to ipfs/kubo that referenced this pull request Feb 4, 2026
* feat(config): Import.* and unixfs-v1-2025 profile

implements IPIP-499: add config options for controlling UnixFS DAG
determinism and introduces `unixfs-v1-2025` and `unixfs-v0-2015`
profiles for cross-implementation CID reproducibility.

changes:
- add Import.* fields: HAMTDirectorySizeEstimation, SymlinkMode,
  DAGLayout, IncludeEmptyDirectories, IncludeHidden
- add validation for all Import.* config values
- add unixfs-v1-2025 profile (recommended for new data)
- add unixfs-v0-2015 profile (alias: legacy-cid-v0)
- remove deprecated test-cid-v1 and test-cid-v1-wide profiles
- wire Import.HAMTSizeEstimationMode() to boxo globals
- update go.mod to use boxo with SizeEstimationMode support

ref: https://specs.ipfs.tech/ipips/ipip-0499/

* feat(add): add --dereference-symlinks, --empty-dirs, --hidden CLI flags

add CLI flags for controlling file collection behavior during ipfs add:

- `--dereference-symlinks`: recursively resolve symlinks to their target
  content (replaces deprecated --dereference-args which only worked on
  CLI arguments). wired through go-ipfs-cmds to boxo's SerialFileOptions.
- `--empty-dirs` / `-E`: include empty directories (default: true)
- `--hidden` / `-H`: include hidden files (default: false)

these flags are CLI-only and not wired to Import.* config options because
go-ipfs-cmds library handles input file filtering before the directory
tree is passed to kubo. removed unused Import.UnixFSSymlinkMode config
option that was defined but never actually read by the CLI.

also:
- wire --trickle to Import.UnixFSDAGLayout config default
- update go-ipfs-cmds to v0.15.1-0.20260117043932-17687e216294
- add SYMLINK HANDLING section to ipfs add help text
- add CLI tests for all three flags

ref: ipfs/specs#499

* test(add): add CID profile tests and wire SizeEstimationMode

add comprehensive test suite for UnixFS CID determinism per IPIP-499:
- verify exact HAMT threshold boundary for both estimation modes:
  - v0-2015 (links): sum(name_len + cid_len) == 262144
  - v1-2025 (block): serialized block size == 262144
- verify HAMT triggers at threshold + 1 byte for both profiles
- add all deterministic CIDs for cross-implementation testing

also wires SizeEstimationMode through CLI/API, allowing
Import.UnixFSHAMTSizeEstimation config to take effect.

bumps boxo to ipfs/boxo@6707376 which aligns HAMT threshold with
JS implementation (uses > instead of >=), fixing CID determinism
at the exact 256 KiB boundary.

* feat(add): --dereference-symlinks now resolves all symlinks

Previously, resolving symlinks required two flags:
- --dereference-args: resolved symlinks passed as CLI arguments
- --dereference-symlinks: resolved symlinks inside directories

Now --dereference-symlinks handles both cases. Users only need one flag
to fully dereference symlinks when adding files to IPFS.

The deprecated --dereference-args still works for backwards compatibility
but is no longer necessary.

* chore: update boxo and improve changelog

- update boxo to ebdaf07c (nil filter fix, thread-safety docs)
- simplify changelog for IPIP-499 section
- shorten test names, move context to comments

* chore: update boxo to 5cf22196

* chore: apply suggestions from code review

Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>

* test(add): verify balanced DAG layout produces uniform leaf depth

add test that confirms kubo uses balanced layout (all leaves at same
depth) rather than balanced-packed (varying depths). creates 45MiB file
to trigger multi-level DAG and walks it to verify leaf depth uniformity.

includes trickle subtest to validate test logic can detect varying depths.

supports CAR export via DAG_LAYOUT_CAR_OUTPUT env var for test vectors.

* chore(deps): update boxo to 6141039ad8ef

switches to ipfs/boxo@6141039

changes since 5cf22196ad0b:
- refactor(unixfs): use arithmetic for exact block size calculation
- refactor(unixfs): unify size tracking and make SizeEstimationMode immutable
- feat(unixfs): optimize SizeEstimationBlock and add mode/mtime tests

also clarifies that directory sharding globals affect both `ipfs add` and MFS.

* test(cli): improve HAMT threshold tests with exact +1 byte verification

- add UnixFSDataType() helper to directly check UnixFS type via protobuf
- refactor threshold tests to use exact +1 byte calculations instead of +1 file
- verify directory type directly (ft.TDirectory vs ft.THAMTShard) instead of
  inferring from link count
- clean up helper function signatures by removing unused cidLength parameter

* test(cli): consolidate profile tests into cid_profiles_test.go

remove duplicate profile threshold tests from add_test.go since they
are fully covered by the data-driven tests in cid_profiles_test.go.

changes:
- improve test names to describe what threshold is being tested
- add inline documentation explaining each test's purpose
- add byte-precise helper IPFSAddDeterministicBytes for threshold tests
- remove ~200 lines of duplicated test code from add_test.go
- keep non-profile tests (pinning, symlinks, hidden files) in add_test.go

* chore: update to rebased boxo and go-ipfs-cmds PRs

* docs: add HAMT threshold fix details to changelog

* feat(mfs): use Import config for CID version and hash function

make MFS commands (files cp, files write, files mkdir, files chcid)
respect Import.CidVersion and Import.HashFunction config settings
when CLI options are not explicitly provided.

also add tests for:
- files write respects Import.UnixFSRawLeaves=true
- single-block file: files write produces same CID as ipfs add
- updated comments clarifying CID parity with ipfs add

* feat(files): wire Import.UnixFSChunker and UnixFSDirectoryMaxLinks to MFS

`ipfs files` commands now respect these Import.* config options:
- UnixFSChunker: configures chunk size for `files write`
- UnixFSDirectoryMaxLinks: triggers HAMT sharding in `files mkdir`
- UnixFSHAMTDirectorySizeEstimation: controls size estimation mode

previously, MFS used hardcoded defaults ignoring user config.

changes:
- config/import.go: add UnixFSSplitterFunc() returning chunk.SplitterGen
- core/node/core.go: pass chunker, maxLinks, sizeEstimationMode to
  mfs.NewRoot() via new boxo RootOption API
- core/commands/files.go: pass maxLinks and sizeEstimationMode to
  mfs.Mkdir() and ensureContainingDirectoryExists(); document that
  UnixFSFileMaxLinks doesn't apply to files write (trickle DAG limitation)
- test/cli/files_test.go: add tests for UnixFSDirectoryMaxLinks and
  UnixFSChunker, including CID parity test with `ipfs add --trickle`

related: boxo@54e044f1b265

* feat(files): wire Import.UnixFSHAMTDirectoryMaxFanout and UnixFSHAMTDirectorySizeThreshold

wire remaining HAMT config options to MFS root:
- Import.UnixFSHAMTDirectoryMaxFanout via mfs.WithMaxHAMTFanout
- Import.UnixFSHAMTDirectorySizeThreshold via mfs.WithHAMTShardingSize

add CLI tests:
- files mkdir respects Import.UnixFSHAMTDirectoryMaxFanout
- files mkdir respects Import.UnixFSHAMTDirectorySizeThreshold
- config change takes effect after daemon restart

add UnixFSHAMTFanout() helper to test harness

update boxo to ac97424d99ab90e097fc7c36f285988b596b6f05

* fix(mfs): single-block files in CIDv1 dirs now produce raw CIDs

problem: `ipfs files write` in CIDv1 directories wrapped single-block
files in dag-pb even when raw-leaves was enabled, producing different
CIDs than `ipfs add --raw-leaves` for the same content.

fix: boxo now collapses single-block ProtoNode wrappers (with no
metadata) to RawNode in DagModifier.GetNode(). files with mtime/mode
stay as dag-pb since raw blocks cannot store UnixFS metadata.

also fixes sparse file writes where writing past EOF would lose data
because expandSparse didn't update the internal node pointer.

updates boxo to v0.36.1-0.20260203003133-7884ae23aaff
updates t0250-files-api.sh test hashes to match new behavior

* chore(test): use Go 1.22+ range-over-int syntax

* chore: update boxo to c6829fe26860

- fix typo in files write help text
- update boxo with CI fixes (gofumpt, race condition in test)

* chore: update go-ipfs-cmds to 192ec9d15c1f

includes binary content types fix: gzip, zip, vnd.ipld.car, vnd.ipld.raw,
vnd.ipfs.ipns-record

* chore: update boxo to 0a22cde9225c

includes refactor of maxLinks check in addLinkChild (review feedback).

* ci: fix helia-interop and improve caching

skip '@helia/mfs - should have the same CID after creating a file' test
until helia implements IPIP-499 (tracking: ipfs/helia#941)

the test fails because kubo now collapses single-block files to raw CIDs
while helia explicitly uses reduceSingleLeafToSelf: false

changes:
- run aegir directly instead of helia-interop binary (binary ignores --grep flags)
- cache node_modules keyed by @helia/interop version from npm registry
- skip npm install on cache hit (matches ipfs-webui caching pattern)

* chore: update boxo to 1e30b954

includes latest upstream changes from boxo main

* chore: update go-ipfs-cmds to 1b2a641ed6f6

* chore: update boxo to f188f79fd412

switches to boxo@main after merging ipfs/boxo#1088

* chore: update go-ipfs-cmds to af9bcbaf5709

switches to go-ipfs-cmds@master after merging ipfs/go-ipfs-cmds#315

---------

Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>